Goto

Collaborating Authors

 humanlike agent


Researchers design virtual environment to spur development of helpful home robots

#artificialintelligence

Without much prior experience, kids can recognize other people's intentions and come up with plans to help them achieve their goals, even in novel scenarios. That's why researchers at MIT, Nvidia, and ETH Zurich developed Watch-And-Help (WAH), a challenge in which embodied AI agents need to understand goals by watching a demonstration of a human performing a task and coordinating with the human to solve the task as quickly as possible. The concept of embodied AI draws on embodied cognition, the theory that many features of psychology -- human or otherwise -- are shaped by aspects of the entire body of an organism. By applying this logic to AI, researchers hope to improve the performance of AI systems like chatbots, robots, autonomous vehicles, and even smart speakers that interact with their environments, people, and other AI. A truly embodied robot could check to see whether a door is locked, for instance, or retrieve a smartphone that's ringing in an upstairs bedroom.


Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration

arXiv.org Artificial Intelligence

In this paper, we introduce Watch-And-Help (WAH), a challenge for testing social intelligence in agents. In WAH, an AI agent needs to help a humanlike agent perform a complex household task efficiently. To succeed, the AI agent needs to i) understand the underlying goal of the task by watching a single demonstration of the humanlike agent performing the same task (social perception), and ii) coordinate with the humanlike agent to solve the task in an unseen environment as fast as possible (human-AI collaboration). For this challenge, we build VirtualHome-Social, a multi-agent household environment, and provide a benchmark including both planning and learning based baselines. We evaluate the performance of AI agents with the humanlike agent as well as with real humans using objective metrics and subjective user ratings. Experimental results demonstrate that the proposed challenge and virtual environment enable a systematic evaluation on the important aspects of machine social intelligence at scale. Without much prior experience, children can robustly recognize goals of other people by simply watching them act in an environment, and are able to come up with plans to help them, even in novel scenarios. In contrast, the most advanced AI systems to date still struggle with such basic social skills. In order to achieve the level of social intelligence required to effectively help humans, an AI agent should acquire two key abilities: i) social perception, i.e., the ability to understand human behavior, and ii) collaborative planning, i.e., the ability to reason about the physical environment and plan its actions to coordinate with humans. In this paper, we are interested in developing AI agents with these two abilities. Towards this goal, we introduce a new AI challenge, Watch-And-Help (WAH), which focuses on social perception and human-AI collaboration. In this challenge, an AI agent needs to collaborate with a humanlike agent to enable it to achieve the goal faster. In particular, we present a 2-stage framework as shown in Figure 1. In the first, Watch stage, an AI agent (Bob) watches a humanlike agent (Alice) performing a task once and infers Alice's goal from her actions.


Enhancing the Monte Carlo Tree Search Algorithm for Video Game Testing

arXiv.org Artificial Intelligence

In this paper, we study the effects of several Monte Carlo Tree Search (MCTS) modifications for video game testing. Although MCTS modifications are highly studied in game playing, their impacts on finding bugs are blank. We focused on bug finding in our previous study where we introduced synthetic and human-like test goals and we used these test goals in Sarsa and MCTS agents to find bugs. In this study, we extend the MCTS agent with several modifications for game testing purposes. Furthermore, we present a novel tree reuse strategy. We experiment with these modifications by testing them on three testbed games, four levels each, that contain 45 bugs in total. We use the General Video Game Artificial Intelligence (GVG-AI) framework to create the testbed games and collect 427 human tester trajectories using the GVG-AI framework. We analyze the proposed modifications in three parts: we evaluate their effects on bug finding performances of agents, we measure their success under two different computational budgets, and we assess their effects on human-likeness of the human-like agent. Our results show that MCTS modifications improve the bug finding performance of the agents.


Automated Video Game Testing Using Synthetic and Human-Like Agents

arXiv.org Artificial Intelligence

In this paper, we present a new methodology that employs tester agents to automate video game testing. We introduce two types of agents -synthetic and human-like- and two distinct approaches to create them. Our agents are derived from Reinforcement Learning (RL) and Monte Carlo Tree Search (MCTS) agents, but focus on finding defects. The synthetic agent uses test goals generated from game scenarios, and these goals are further modified to examine the effects of unintended game transitions. The human-like agent uses test goals extracted by our proposed multiple greedy-policy inverse reinforcement learning (MGP-IRL) algorithm from tester trajectories. MGPIRL captures multiple policies executed by human testers. These testers' aims are finding defects while interacting with the game to break it, which is considerably different from game playing. We present interaction states to model such interactions. We use our agents to produce test sequences, run the game with these sequences, and check the game for each run with an automated test oracle. We analyze the proposed method in two parts: we compare the success of human-like and synthetic agents in bug finding, and we evaluate the similarity between humanlike agents and human testers. We collected 427 trajectories from human testers using the General Video Game Artificial Intelligence (GVG-AI) framework and created three games with 12 levels that contain 45 bugs. Our experiments reveal that human-like and synthetic agents compete with human testers' bug finding performances. Moreover, we show that MGP-IRL increases the human-likeness of agents while improving the bug finding performance.